# Reduced Alphabet Representation

This repository contains the implementation for the paper "Identification of Phase Separating Proteins with Distributed Reduced Alphabet Representations of Sequences".

Dependencies:
- numpy
- pandas
- Biopython
- Gensim
- Scikit-learn
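
To confirm that the dependencies above are installed before running anything, a quick check along these lines may help. This is a minimal sketch and not part of the repository: `REQUIRED_MODULES` and `missing_modules` are hypothetical names introduced here for illustration.

```python
import importlib.util

# Import names for the dependencies listed above. The import name can differ
# from the package name (Biopython -> Bio, Scikit-learn -> sklearn).
REQUIRED_MODULES = ["numpy", "pandas", "Bio", "gensim", "sklearn"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

The check only looks up whether each module can be located; it does not verify versions, since the README does not state which versions are required.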

To get predictions with this code, follow these steps:
1. Change your working directory to the repository's root folder.
2. Pass the full path of the FASTA file containing the sequences to predict as the argument to the `get_prediction` function in `main.py`:
```py
...
output = get_prediction("<path_to_fasta_file>")
...
```
3. Open a terminal in the root folder.
4. Run the prediction script:
```sh
python main.py
```

The prediction output is printed to the terminal.
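
Before editing `main.py` as described in step 2, it can be useful to confirm that the FASTA path is valid and that the file contains the expected number of sequences. The following is a minimal standard-library sketch under that assumption; `count_fasta_records` is a hypothetical helper, not a function provided by this repository, and it only counts header lines rather than validating the sequences themselves.

```python
from pathlib import Path


def count_fasta_records(fasta_path):
    """Count records in a FASTA file by counting header lines ('>')."""
    path = Path(fasta_path)
    if not path.is_file():
        # Fail early with a clear message instead of inside the prediction script.
        raise FileNotFoundError(f"FASTA file not found: {path}")
    with path.open() as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

For example, calling `count_fasta_records("<path_to_fasta_file>")` with the same path passed to `get_prediction` returns the number of header lines found, or raises `FileNotFoundError` if the path is wrong.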